Mahdieh Soleymani Baghshah

Sharif University of Technology

The Judge Who Never Admits: Hidden Shortcuts in LLM-based Evaluation

Feb 08, 2026

Efficient Adversarial Attacks on High-dimensional Offline Bandits

Feb 02, 2026

SUSD: Structured Unsupervised Skill Discovery through State Factorization

Feb 02, 2026

Mechanistic Interpretability of Large-Scale Counting in LLMs through a System-2 Strategy

Jan 06, 2026

Infinity and Beyond: Compositional Alignment in VAR and Diffusion T2I Models

Dec 12, 2025

Limits and Gains of Test-Time Scaling in Vision-Language Reasoning

Dec 11, 2025

Persian Musical Instruments Classification Using Polyphonic Data Augmentation

Nov 07, 2025

The Illusion of Procedural Reasoning: Measuring Long-Horizon FSM Execution in LLMs

Nov 05, 2025

MEENA (PersianMMMU): Multimodal-Multilingual Educational Exams for N-level Assessment

Aug 24, 2025

LLM-Agent-Controller: A Universal Multi-Agent Large Language Model System as a Control Engineer

May 26, 2025